Mixture Regression for Covariate Shift
Authors
Abstract
In supervised learning it is typically presumed that the training and test points are drawn from the same distribution. In practice this assumption is commonly violated. The situation where the training and test data come from different distributions is called covariate shift. Recent work has examined techniques for dealing with covariate shift in terms of minimisation of generalisation error. As yet, the literature lacks a Bayesian generative perspective on this problem. This paper tackles this issue for regression models. Recent work on covariate shift can be understood in terms of mixture regression. Using this view, we obtain a general approach to regression under covariate shift, which reproduces previous work as a special case. The main advantages of this new formulation over previous models for covariate shift are that we no longer need to presume the test and training densities are known, that the regression and density estimation are combined into a single procedure, and that previous methods are reproduced as special cases of this procedure, shedding light on the implicit assumptions those methods make.
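For orientation, the kind of reweighting that the abstract contrasts itself with can be sketched as importance-weighted least squares. This is not the mixture regression procedure of the paper; it is a minimal illustration of one common reweighting approach, and it assumes exactly what the abstract says the new formulation avoids: that the training and test input densities are known. The Gaussian densities, the sine target, the linear model, and the names `gaussian_pdf` and `importance_weighted_linear_fit` are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def importance_weighted_linear_fit(x, y, weights):
    """Fit y ~ a + b*x by weighted least squares; returns array([a, b]).

    Scaling each row by sqrt(weight) turns the weighted problem into an
    ordinary least-squares problem.
    """
    design = np.column_stack([np.ones_like(x), x])
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef


rng = np.random.default_rng(0)

# Assumption: training inputs ~ N(0, 1), test inputs ~ N(1.5, 0.5).
x_train = rng.normal(0.0, 1.0, size=500)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=500)

# Importance weights p_test(x) / p_train(x), using the assumed known densities.
w = gaussian_pdf(x_train, 1.5, 0.5) / gaussian_pdf(x_train, 0.0, 1.0)

coef_unweighted = importance_weighted_linear_fit(
    x_train, y_train, np.ones_like(x_train)
)
coef_weighted = importance_weighted_linear_fit(x_train, y_train, w)

print("unweighted [intercept, slope]:", coef_unweighted)
print("weighted   [intercept, slope]:", coef_weighted)
```

Because the linear model is misspecified for a sine target, the two fits differ: the weighted fit favours the region where test inputs are assumed to concentrate. In practice the densities would have to be estimated, which is the separate step the abstract proposes to fold into the regression itself.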
Similar resources
Mixture regression for observational data, with application to functional regression models
In a regression analysis, suppose we suspect that there are several heterogeneous groups in the population that a sample represents. Mixture regression models have been applied to address such problems. By modeling the conditional distribution of the response given the covariate as a mixture, the sample can be clustered into groups and the individual regression models for the groups can be esti...
Understanding covariate shift in model performance
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN's performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage for using covariate shift to reweight the training data...
Bayesian mixture of splines for spatially adaptive nonparametric regression
A Bayesian approach is presented for spatially adaptive nonparametric regression where the regression function is modelled as a mixture of splines. Each component spline in the mixture has associated with it a smoothing parameter which is defined over a local region of the covariate space. These local regions overlap such that individual data points may lie simultaneously in multiple regions. C...
Robust Covariate Shift Regression
In many learning settings, the source data available to train a regression model differs from the target data it encounters when making predictions due to input distribution shift. Appropriately dealing with this situation remains an important challenge. Existing methods attempt to “reweight” the source data samples to better represent the target domain, but this introduces strong inductive bia...